\section{Constants}

Humans, including programmers, often use round numbers like 10, 100 and 1000,
in real life as well as in code.

The practicing reverse engineer usually knows them well in hexadecimal representation:
10=0xA, 100=0x64, 1000=0x3E8, 10000=0x2710.

The constants \TT{0xAAAAAAAA} (0b10101010101010101010101010101010) and \\
\TT{0x55555555} (0b01010101010101010101010101010101)  are also popular---those
are composed of alternating bits.

Such a pattern helps to distinguish a valid signal from one where all bits are turned on (0b1111 \dots) or off (0b0000 \dots).
For example, the \TT{0x55AA} constant
is used at least in the boot sector, \ac{MBR}, 
and in the \ac{ROM} of IBM-compatible extension cards.
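
A check for this signature could be sketched in C as follows.
This is only an illustration: the function names \TT{has\_boot\_signature()} and \TT{has\_option\_rom\_signature()} are made up for this example,
and the offsets (bytes 510 and 511 of a 512-byte boot sector, bytes 0 and 1 of an extension ROM image) are the commonly documented ones rather than something taken from this text.
Note that the bytes lie in memory as 0x55 followed by 0xAA.

\begin{lstlisting}[style=customc]
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a 512-byte boot sector
   ends with the bytes 0x55, 0xAA. */
int has_boot_signature(const uint8_t *sector, size_t len)
{
	if (len < 512)
		return 0;
	return sector[510] == 0x55 && sector[511] == 0xAA;
}

/* Hypothetical helper: returns 1 if an extension ROM image
   starts with the bytes 0x55, 0xAA. */
int has_option_rom_signature(const uint8_t *rom, size_t len)
{
	return len >= 2 && rom[0] == 0x55 && rom[1] == 0xAA;
}
\end{lstlisting}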

Some algorithms, especially cryptographic ones, use distinctive constants, which are easy to find
in code using \IDA.

\myindex{MD5}
\newcommand{\URLMD}{http://go.yurichev.com/17111}

For example, the MD5\footnote{\href{\URLMD}{wikipedia}} algorithm initializes its own internal variables like this:

\begin{verbatim}
var int h0 := 0x67452301
var int h1 := 0xEFCDAB89
var int h2 := 0x98BADCFE
var int h3 := 0x10325476
\end{verbatim}

If you find these four constants used in the code in a row, it is highly probable that this function is related to MD5.
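
Such a search is easy to automate.
The following is a minimal sketch, assuming the constants are stored in little-endian byte order (as on x86) either as data or as immediate operands;
the names \TT{contains\_le32()} and \TT{count\_md5\_init\_constants()} are invented for this example.
It reports how many of the four values occur anywhere in a buffer, not whether they are actually used together.

\begin{lstlisting}[style=customc]
#include <stddef.h>
#include <stdint.h>

/* MD5 initial state words, as listed above. */
static const uint32_t md5_init[4] = {
	0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
};

/* Returns 1 if the 32-bit value v occurs somewhere in buf
   in little-endian byte order (assumption: x86-like target). */
static int contains_le32(const uint8_t *buf, size_t len, uint32_t v)
{
	size_t i;

	for (i = 0; i + 4 <= len; i++) {
		uint32_t w = (uint32_t)buf[i] |
			((uint32_t)buf[i + 1] << 8) |
			((uint32_t)buf[i + 2] << 16) |
			((uint32_t)buf[i + 3] << 24);
		if (w == v)
			return 1;
	}
	return 0;
}

/* Counts how many of the four MD5 initial values occur in buf (0..4). */
int count_md5_init_constants(const uint8_t *buf, size_t len)
{
	int i, found = 0;

	for (i = 0; i < 4; i++)
		found += contains_le32(buf, len, md5_init[i]);
	return found;
}
\end{lstlisting}

A result of 4 for the code section of a binary is a good reason to look closer at that binary; a lower count says little on its own.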

\par Another example is the CRC16/CRC32 family of algorithms,
whose implementations often use precomputed tables like this one:

\begin{lstlisting}[caption=linux/lib/crc16.c,style=customc]
/** CRC table for the CRC-16. The poly is 0x8005 (x^16 + x^15 + x^2 + 1) */
u16 const crc16_table[256] = {
	0x0000, 0xC0C1, 0xC181, 0x0140, 0xC301, 0x03C0, 0x0280, 0xC241,
	0xC601, 0x06C0, 0x0780, 0xC741, 0x0500, 0xC5C1, 0xC481, 0x0440,
	0xCC01, 0x0CC0, 0x0D80, 0xCD41, 0x0F00, 0xCFC1, 0xCE81, 0x0E40,
	...
\end{lstlisting}

See also the precomputed table for CRC32: \myref{sec:CRC32}.
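
Such tables are not arbitrary: they can be regenerated from the polynomial, which helps to identify a table found in a binary.
The sketch below assumes the usual least-significant-bit-first formulation, where the polynomial 0x8005 appears bit-reversed as 0xA001;
the function name \TT{make\_crc16\_table()} is made up for this example.
Its first entries are the same as the beginning of the table above.

\begin{lstlisting}[style=customc]
#include <stdint.h>

/* 0xA001 is 0x8005 with its 16 bits reversed (LSB-first form). */
#define CRC16_POLY_REFLECTED 0xA001u

/* Fills table[] with the 256 entries of the LSB-first CRC-16 table. */
void make_crc16_table(uint16_t table[256])
{
	unsigned int n, bit;

	for (n = 0; n < 256; n++) {
		uint16_t crc = (uint16_t)n;

		/* Process the 8 bits of the index, lowest bit first. */
		for (bit = 0; bit < 8; bit++) {
			if (crc & 1)
				crc = (uint16_t)((crc >> 1) ^ CRC16_POLY_REFLECTED);
			else
				crc = (uint16_t)(crc >> 1);
		}
		table[n] = crc;
	}
}
\end{lstlisting}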

Tableless CRC algorithms use well-known polynomials instead, for example, 0xEDB88320 for CRC32.
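
A tableless CRC32 might look like the sketch below, and this is where the constant shows up as an immediate operand in the compiled code.
The initial value and the final XOR with 0xFFFFFFFF are assumptions: they follow the widespread CRC-32 variant, and other variants differ in them.
The name \TT{crc32\_bitwise()} is invented for this example.

\begin{lstlisting}[style=customc]
#include <stddef.h>
#include <stdint.h>

/* Bit-by-bit CRC32 without a lookup table (LSB-first form). */
uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu; /* assumed initial value */
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (bit = 0; bit < 8; bit++) {
			if (crc & 1)
				crc = (crc >> 1) ^ 0xEDB88320u;
			else
				crc >>= 1;
		}
	}
	return crc ^ 0xFFFFFFFFu; /* assumed final XOR */
}
\end{lstlisting}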

\subsection{Magic numbers}
\label{magic_numbers}

\newcommand{\FNURLMAGIC}{\footnote{\href{http://go.yurichev.com/17112}{wikipedia}}}

Many file formats define a standard file header that contains one or more \IT{magic numbers}\FNURLMAGIC{}.

\myindex{MS-DOS}

For example, all Win32 and MS-DOS executables start with the two characters \q{MZ}\footnote{\href{http://go.yurichev.com/17113}{wikipedia}}.

\myindex{MIDI}

At the beginning of a MIDI file the \q{MThd} signature must be present. 
If we have a program which uses MIDI files for something,
it very likely checks the file for validity by checking at least the first 4 bytes.

This could be done like this:
(\IT{buf} points to the beginning of the loaded file in memory)

\begin{lstlisting}[style=customasmx86]
cmp [buf], 0x6468544D ; "MThd"
jnz _error_not_a_MIDI_file
\end{lstlisting}

\myindex{\CStandardLibrary!memcmp()}
\myindex{x86!\Instructions!CMPSB}

\dots or by calling a function for comparing memory blocks like \TT{memcmp()} or any other equivalent code
up to a \TT{CMPSB} (\myref{REPE_CMPSx}) instruction.

Once you find such a place, you can already tell where the loading of the MIDI file starts.
You can also see the location
of the buffer with the contents of the MIDI file, what is used from the buffer, and how.
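
In C source, both variants of the check could look roughly like this.
It is a sketch only; the function names \TT{looks\_like\_midi()} and \TT{looks\_like\_midi\_dword()} are hypothetical.
The second function shows why the constant in the assembly listing above reads \TT{0x6468544D}: it is the four characters taken as one little-endian 32-bit value.

\begin{lstlisting}[style=customc]
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Variant 1: compare the first 4 bytes with memcmp(). */
int looks_like_midi(const unsigned char *buf, size_t len)
{
	return len >= 4 && memcmp(buf, "MThd", 4) == 0;
}

/* Variant 2: read the first 4 bytes as a little-endian dword
   and compare it with a single 32-bit constant. */
int looks_like_midi_dword(const unsigned char *buf, size_t len)
{
	uint32_t w;

	if (len < 4)
		return 0;
	w = (uint32_t)buf[0] |
		((uint32_t)buf[1] << 8) |
		((uint32_t)buf[2] << 16) |
		((uint32_t)buf[3] << 24);
	return w == 0x6468544Du; /* "MThd" */
}
\end{lstlisting}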

\subsubsection{Dates}

\myindex{UFS2}
\myindex{FreeBSD}
\myindex{HASP}

Often, one may encounter a number like \TT{0x19870116}, which clearly looks like a date (year 1987, 1st month (January), 16th day).
This may be someone's birthday (a programmer, his/her relative, child), or some other important date.
The date may also be written in reverse order, like \TT{0x16011987}.
American-style dates are also popular, like \TT{0x01161987}.

A well-known example is \TT{0x19540119} (the magic number used in the UFS2 superblock structure), which is the birthday of Marshall Kirk McKusick, a prominent FreeBSD contributor.
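
When sifting through a long list of constants, a simple filter can flag the date-like ones.
The sketch below treats each hexadecimal digit as a decimal digit and only handles the year-month-day order.
The function names and the accepted year range (1900 to 2099) are arbitrary assumptions made for this example.

\begin{lstlisting}[style=customc]
#include <stdint.h>

/* Decodes the low byte of b as two decimal digits;
   returns -1 if either hex digit is not in 0..9. */
static int bcd_byte(uint32_t b)
{
	unsigned int hi = (b >> 4) & 0xF, lo = b & 0xF;

	if (hi > 9 || lo > 9)
		return -1;
	return (int)(hi * 10 + lo);
}

/* Returns 1 if v reads as a YYYYMMDD date when printed in hex.
   The 1900..2099 year range is an arbitrary assumption. */
int looks_like_yyyymmdd(uint32_t v)
{
	int cc = bcd_byte(v >> 24);
	int yy = bcd_byte(v >> 16);
	int mm = bcd_byte(v >> 8);
	int dd = bcd_byte(v);
	int year;

	if (cc < 0 || yy < 0 || mm < 0 || dd < 0)
		return 0;
	year = cc * 100 + yy;
	return year >= 1900 && year <= 2099 &&
		mm >= 1 && mm <= 12 && dd >= 1 && dd <= 31;
}
\end{lstlisting}

A filter like this produces false positives, so a match is only a hint that the value deserves a second look.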

\myindex{Stuxnet}
Stuxnet uses the number ``19790509'' (as a string rather than a 32-bit number, though), and this led to speculation
that the malware is connected to Israel\footnote{This is the date of the execution of Habib Elghanian, a Persian Jew.}.

Numbers like these are also very popular in amateur-grade cryptography. For example, here is an excerpt from the internals of the \IT{secret function} of the HASP3 dongle
\footnote{\url{https://web.archive.org/web/20160311231616/http://www.woodmann.com/fravia/bayu3.htm}}:

\begin{lstlisting}[style=customc]
void xor_pwd(void) 
{ 
	int i; 
	
	pwd^=0x09071966;
	for(i=0;i<8;i++) 
	{ 
		al_buf[i]= pwd & 7; pwd = pwd >> 3; 
	} 
};

void emulate_func2(unsigned short seed)
{ 
	int i, j; 
	for(i=0;i<8;i++) 
	{ 
		ch[i] = 0; 
		
		for(j=0;j<8;j++)
		{ 
			seed *= 0x1989; 
			seed += 5; 
			ch[i] |= (tab[(seed>>9)&0x3f]) << (7-j); 
		}
	} 
}
\end{lstlisting}

\subsubsection{DHCP}

This applies to network protocols as well.
For example, DHCP network packets contain the so-called \IT{magic cookie}: \TT{0x63538263}.
Any code that generates DHCP packets must embed this constant into the packet somewhere.
If we find it in the code, we can find where this happens.
And not only that: any program which can receive a DHCP packet must verify the \IT{magic cookie} by comparing it with the constant.

For example, let's take the dhcpcore.dll file from Windows 7 x64 and search for the constant.
We find it twice:
the constant appears to be used in two functions with descriptive names,\\
\TT{DhcpExtractOptionsForValidation()} and \TT{DhcpExtractFullOptions()}:

\begin{lstlisting}[caption=dhcpcore.dll (Windows 7 x64),style=customasmx86]
.rdata:000007FF6483CBE8 dword_7FF6483CBE8 dd 63538263h          ; DATA XREF: DhcpExtractOptionsForValidation+79
.rdata:000007FF6483CBEC dword_7FF6483CBEC dd 63538263h          ; DATA XREF: DhcpExtractFullOptions+97
\end{lstlisting}

And here are the places where these constants are accessed:

\begin{lstlisting}[caption=dhcpcore.dll (Windows 7 x64),style=customasmx86]
.text:000007FF6480875F  mov     eax, [rsi]
.text:000007FF64808761  cmp     eax, cs:dword_7FF6483CBE8
.text:000007FF64808767  jnz     loc_7FF64817179
\end{lstlisting}

And:

\begin{lstlisting}[caption=dhcpcore.dll (Windows 7 x64),style=customasmx86]
.text:000007FF648082C7  mov     eax, [r12]
.text:000007FF648082CB  cmp     eax, cs:dword_7FF6483CBEC
.text:000007FF648082D1  jnz     loc_7FF648173AF
\end{lstlisting}
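
The check performed by these fragments could be sketched in C as follows.
This is not the code of dhcpcore.dll: the function name \TT{has\_dhcp\_magic\_cookie()} is invented, and the offset of 236 bytes (the length of the fixed BOOTP header that precedes the cookie) comes from the protocol layout, not from the listings above.
On the wire the cookie is the byte sequence 63 82 53 63; read as a little-endian dword on x86, it becomes \TT{0x63538263}.

\begin{lstlisting}[style=customc]
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the cookie follows the 236-byte fixed BOOTP header. */
#define DHCP_COOKIE_OFFSET 236

/* Returns 1 if pkt is long enough and carries the DHCP magic cookie. */
int has_dhcp_magic_cookie(const uint8_t *pkt, size_t len)
{
	/* Cookie bytes in the order they appear in the packet. */
	static const uint8_t cookie[4] = { 0x63, 0x82, 0x53, 0x63 };

	if (len < DHCP_COOKIE_OFFSET + 4)
		return 0;
	return memcmp(pkt + DHCP_COOKIE_OFFSET, cookie, 4) == 0;
}
\end{lstlisting}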

\subsection{Specific constants}

Sometimes, there is a specific constant for some type of code.
For example, the author once dug into a piece of code where the number 12 was encountered suspiciously often.
The size of many arrays was 12, or a multiple of 12 (24, etc.).
As it turned out, that code takes a 12-channel audio file as input and processes it.

And vice versa: for example, if a program works with a text field which has a length of 120 bytes,
there has to be a constant 120 or 119 somewhere in the code.
If UTF-16 is used, then $2 \cdot 120$.
If the code works with network packets of fixed size, it's a good idea to search for this constant in the code as well.

This is also true for amateur cryptography (license keys, etc.).
If an encrypted block has a size of $n$ bytes, you may want to try to find occurrences of this number throughout the code.
Also, if you see a piece of code which is repeated $n$ times in a loop during execution,
this may be an encryption/decryption routine.

\subsection{Searching for constants}

It is easy in \IDA: press Alt-B (search for a sequence of bytes) or Alt-I (search for an immediate value).
\myindex{binary grep}
To search for a constant in a big pile of files, or in non-executable files,
there is a small utility called \IT{binary grep}\footnote{\BGREPURL}.
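
The core of such a search is simple.
The following sketch is not the code of \IT{binary grep}; it is a naive byte-pattern search with a made-up name, \TT{find\_bytes()}, shown only to illustrate the idea.
To look for a 32-bit constant, the pattern is the constant's bytes in the target's byte order, for example 63 82 53 63 for the DHCP \IT{magic cookie}.

\begin{lstlisting}[style=customc]
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the offset of the first occurrence of pat in buf
   at or after start, or (size_t)-1 if there is none. */
size_t find_bytes(const uint8_t *buf, size_t len, size_t start,
		const uint8_t *pat, size_t pat_len)
{
	size_t i;

	if (pat_len == 0 || pat_len > len)
		return (size_t)-1;
	for (i = start; i <= len - pat_len; i++)
		if (memcmp(buf + i, pat, pat_len) == 0)
			return i;
	return (size_t)-1;
}
\end{lstlisting}

Calling it again with \TT{start} set to the previous result plus one enumerates all occurrences.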

